Abstract. Many systems exhibit a digit bias. For example, the first digit base 10 of
the Fibonacci numbers, or of $2^n$, equals 1 not 10% or 11% of the time, as one would
expect if all digits were equally likely, but about 30% of the time. This phenomenon, 
known as Benford's Law, has many applications, ranging from detecting tax fraud for 
the IRS to analyzing round-off errors in computer science. 

The central question is determining which data sets follow Benford's law. Inspired 
by natural processes such as particle decay, our work examines models for the decom- 
position of conserved quantities. We prove that in many instances the distribution of 
lengths of the resulting pieces converges to Benford behavior as the number of divisions
grows. The main difficulty is that the resulting random variables are dependent, which
we handle by a careful analysis of the dependencies and tools from Fourier analysis to 
obtain quantified convergence rates. 
1.1. Background and Problem. The distribution of leading digits of numbers in data
sets has a fascinating history, with numerous applications arising in very diverse fields.
In base 10, the probability of observing a first digit of $d$ is frequently not 1/9, as one
would expect if all digits are equally likely, but rather $\log_{10}\left(\frac{d+1}{d}\right)$, leading to almost
30% of the leading digits being a 1. Though this bias was first observed by Newcomb
[New] in the 1880s when he noticed that certain pages in tables of logarithms were more
worn than others, the subject gained popularity with Benford's [Ben] 1938 paper, where
he studied sets ranging from mathematical functions to street addresses of 'famous'
people. Today, Benford's law can be found in applied mathematics [BH1], auditing
[DrNi, MN3, Nig1, Nig2, Nig3, NigMi], biology [CLTF], computer science [Knu],
dynamical systems [Ber1, Ber2, BBH, BHKR, Rod], economics [Tö], geology [NM],
number theory [ARS, KonMi, LS], physics [PTTV], signal processing [PHA], statistics
[MN2, CLM] and voting fraud [Meb], to name just a few. See [BH2, Hu] for extensive
bibliographies, and [BH3, BH4, BH5, Dia, JKKKM, JR, MN1, Pin, Rai] for general
surveys and explanations of the law's prevalence.

One of the most important questions in the subject, as well as one of the hardest, is 
to determine which processes lead to Benford behavior. Many researchers [Adh, AS,
Bh, JKKKM, Lev1, Lev2, MN1, Rob, Sa, Sc1, Sc2, Sc3, ST] have observed that sums,
products and in general arithmetic operations on random variables lead to more Benford
behavior. Many of the proofs use techniques from measure theory and Fourier analysis,
though in some special cases it is possible to obtain closed form expressions for the 
densities, which can be analyzed directly. 

A crucial input in the above arguments is that the random variables are independent. 
In this paper, we explore situations where there are dependencies. Our motivating
example is due to Lemons [Lem], who proposed studying the decomposition of a conserved
quantity (for example, what happens during certain types of particle decay). As the sum 
of the piece sizes must equal the original number, the resulting summands are clearly 
dependent. While it is frequently easy to show that individual pieces are Benford, the 
difficulty is in handling all the pieces simultaneously. 

Lemons models this process and offers it as evidence for the prevalence of Benford 
behavior, arguing that many sets that exhibit Benford behavior are merely the breaking 
down of some conserved quantity. However, Lemons is not completely mathematically 
rigorous in his analysis of the model (which he states in the paper), and glosses over 
several important technical points. We briefly mention some issues. 

The first concerns the constituent pieces. He assumes the set of possible piece sizes is 
bounded above and below and is drawn from a finite set, eventually specializing to the 
case where the sizes are in a simple arithmetic progression (corresponding to a uniform 
spacing), and then taking a limit to assume the pieces are drawn from a continuous 
range. In this paper, we allow our piece lengths to be drawn continuously from intervals 
at the outset, and not just in the limit. This removes some, but by no means all, of 
the technical complications. One must always be careful in replacing discrete systems 
with continuous ones, especially as there can be number-theoretic restrictions on which 
discrete systems have a solution. Modeling any conserved quantity is already quite 
hard with the restriction that the sum of all parts must total to the original starting 
amount; if the pieces are forced to be integers then certain number theoretic issues 
arise. For example, imagine our pieces are of length 2, 4 or 6, so we are trying to solve
$2x_1 + 4x_2 + 6x_3 = n$. There are no solutions if $n = 2011$, but there are 84,840 if $n$ is
2012. By considering a continuous system from the start, we avoid these Diophantine
complications. We will return to the corresponding discrete model in a sequel paper
[BGMRS].
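
The two counts quoted above can be checked directly. The following is a minimal sketch, not part of the original argument: the function name `count_solutions` is ours, and it counts non-negative integer solutions with a standard dynamic-programming recurrence.

```python
def count_solutions(n, parts=(2, 4, 6)):
    """Count non-negative integer solutions of parts[0]*x_1 + parts[1]*x_2 + ... = n."""
    # ways[t] = number of solutions for target t using the parts processed so far
    ways = [1] + [0] * n
    for p in parts:
        for t in range(p, n + 1):
            ways[t] += ways[t - p]
    return ways[n]


if __name__ == "__main__":
    for n in (2011, 2012):
        print(n, count_solutions(n))
```

Every piece length is even, so an odd target such as 2011 can never be reached; this parity obstruction is the kind of number-theoretic restriction that the continuous model avoids.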

Another issue is that it is unclear how the initial piece breaks up. The process is 
not described explicitly, and it is unclear how likely some pieces are relative to others. 
Finally, while he advances heuristics to determine the means of various quantities, there 
is no analysis of variances or correlations. This means that, though it may seem unlikely, 
the averaged behavior could be close to Benford while most partitions would be far from 
Benford.^1

These are important issues, and their resolution and the choice of model affect the
behavior of the system. We mention a result from Miller-Nigrini [MN2], where they prove
that while order statistics are close to Benford's law (base 10), they do not converge in
general to Benford behavior. In particular, this means that if we choose $N$ points
randomly on a stick of length $L$ and use those points to partition our rod, the distribution
of the resulting piece sizes will not be Benford. Motivated by this result and Lemons'
paper, we instead consider a model where at stage $N$ we have $2^N$ sticks of varying
lengths, and each stick is broken into two smaller sticks by making a cut on it. Each cut
is chosen from some random variable $K$ on $(0, 1)$ representing the fraction of the stick's
length at which we make the cut. Explicitly, if we start with one stick of length $L$, after
one iteration we have sticks of length $LK_1$ and $L(1 - K_1)$, and after two iterations we
have sticks of length $LK_1K_2$, $LK_1(1 - K_2)$, $L(1 - K_1)K_3$, and $L(1 - K_1)(1 - K_3)$.
Iterating this process $N$ times, we are left with $2^N$ sticks coming from $2^N - 1$
random variables $K_1, \dots, K_{2^N-1}$, with lengths ranging from $X_1 = LK_1K_2K_4 \cdots K_{2^{N-1}}$
to $X_{2^N} = L(1 - K_1)(1 - K_3)(1 - K_7) \cdots (1 - K_{2^N-1})$ (see Figure 1). Clearly the $X_i$'s
are not independent, as $\sum_{i=1}^{2^N} X_i = L$.

Fix an $N$ and perform the division on $M \gg N$ sticks of length $L$, recording the
observed lengths. If we let M tend to infinity while N is fixed, and then let N go to 
infinity, we can easily show the distribution of all the lengths converges to Benford's 
law. The interesting case is to show that we also get Benford's law (or, more precisely, 
with high probability we are very close to Benford's law) if we take M = 1; in other 
words, we have a large number of divisions of one stick. 
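
The single-stick case $M = 1$ is easy to explore numerically. The sketch below is only illustrative: it assumes uniform cuts (the model allows any suitable density), and the helper names `break_stick`, `first_digit` and `first_digit_frequencies` are ours. It breaks one stick for a given number of levels and tabulates the observed first digits next to $\log_{10}(1 + 1/d)$.

```python
import math
import random


def break_stick(length, levels, rng):
    """Cut every piece in two, `levels` times; returns the 2**levels piece lengths."""
    pieces = [length]
    for _ in range(levels):
        next_pieces = []
        for piece in pieces:
            k = rng.random()  # cut fraction K; uniform on [0, 1) is an assumption
            next_pieces.append(piece * k)
            next_pieces.append(piece * (1.0 - k))
        pieces = next_pieces
    return pieces


def first_digit(x):
    """Leading base-10 digit of a positive float, read off scientific notation."""
    return int(f"{x:.15e}"[0])


def first_digit_frequencies(values):
    """Index d holds the observed frequency of leading digit d (index 0 is unused)."""
    counts = [0] * 10
    for v in values:
        if v > 0:  # a cut fraction of exactly 0 would produce an empty piece
            counts[first_digit(v)] += 1
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    pieces = break_stick(1.0, 14, random.Random(2011))
    freqs = first_digit_frequencies(pieces)
    for d in range(1, 10):
        print(d, round(freqs[d], 4), round(math.log10(1 + 1 / d), 4))
```

The pieces always sum to the original length, which is exactly the dependence that makes the $M = 1$ case harder than the averaged one.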



1.2. Notation. To quantify the first digits distribution of the set $\{X_i\}_{i=1}^{2^N}$, we introduce
some notation. More generally, instead of studying just the first digits we could look
at the significand (called the mantissa in some sources). Recall any positive number $x$
can be written as $S_{10}(x) \cdot 10^{k(x)}$, where $S_{10}(x) \in [1, 10)$ is the significand and
$k(x)$ is an integer. Two numbers have the same leading digits if and only if their
significands are equal.

A sequence $\{y_n\}$ is equidistributed modulo 1 if for any $[a, b] \subset [0, 1]$ we have
$\lim_{N\to\infty} \#\{n \le N : y_n \bmod 1 \in [a, b]\}/N = b - a$. Benford's law (in a stronger form
for the distribution of the significands) is equivalent to the base-10 logarithms of the

^1 Imagine we toss a coin one million times, always getting either all heads or all tails. Let's say these
two outcomes are equally likely. If we were to perform this process trillions of times, the total number of 
heads and tails would be close to each other; however, no individual experiment would be close to 50%. 
Figure 1. Breaking L into pieces, N = 3.



elements in the data sets being equidistributed modulo 1 (see [Dia, MT] for a proof).
To have a first digit of $d$, we need the base-10 logarithm of the significand to lie in
$[\log_{10} d, \log_{10}(d + 1))$; note the length of this interval is
$\log_{10}(d + 1) - \log_{10}(d) = \log_{10}\left(\frac{d+1}{d}\right)$, and we recover Benford's
law for the distribution of the first digit.
For $s \in [1, 10)$, let

$$\varphi_s(u) := \begin{cases} 1 & \text{if the significand of } u \text{ is at most } s \\ 0 & \text{otherwise;} \end{cases} \tag{1.1}$$

thus $\varphi_s$ is the indicator function of the event of a significand at most $s$.
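
As a concrete reading of this notation, here is a small sketch of the significand and of the indicator $\varphi_s$. The function names `significand`, `phi` and `benford_first_digit_prob` are ours, and the boundary guard is only there to absorb floating-point rounding.

```python
import math


def significand(x, base=10):
    """Significand S_B(x) in [1, B) of a positive number x."""
    if x <= 0:
        raise ValueError("the significand is defined for positive numbers only")
    exponent = math.floor(math.log(x, base))
    s = x / base ** exponent
    # Guard against floating-point drift at the ends of [1, B).
    if s < 1:
        s *= base
    elif s >= base:
        s /= base
    return s


def phi(s, u):
    """Indicator of the event that the base-10 significand of u is at most s."""
    return 1 if significand(u) <= s else 0


def benford_first_digit_prob(d):
    """Benford probability log10((d + 1) / d) of a leading digit d."""
    return math.log10((d + 1) / d)


if __name__ == "__main__":
    print(significand(2012), phi(2, 150), phi(2, 250))
    print([round(benford_first_digit_prob(d), 4) for d in range(1, 10)])
```

The nine Benford probabilities telescope to $\log_{10} 10 = 1$, which is a quick sanity check on the interval lengths computed above.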

While in the proofs we concentrate on the special case where each cut is chosen 
uniformly on its interval, we can consider the more general process of the cuts be- 
ing identically distributed but non-uniform, or even the case where different cuts are 
taken from different distributions. We need to be a bit careful; while typically products 
of independent random variables converge to Benford behavior, there are pathological 
choices where this fails (see Example 2.4 of [MN1]). It is convenient to phrase the
needed conditions in terms of the Mellin transform. 

Let $f(x)$ be a continuous real-valued function on $[0, \infty)$ (though in our applications,
as our random variables are cuts represented as a percentage of an interval, $f$ will be a
probability density defined on $[0, 1]$). We define its Mellin transform, $(\mathcal{M}f)(s)$, by

$$(\mathcal{M}f)(s) := \int_0^\infty f(x)\, x^s\, \frac{dx}{x}. \tag{1.2}$$

Note $(\mathcal{M}f)(s) = \mathbb{E}[x^{s-1}]$, and thus results about expected values translate to results on
Mellin transforms; for example, since $f$ is a density we have $(\mathcal{M}f)(1) = 1$. If we let
$x = e^{2\pi u}$ and $s = \sigma - i\xi$, then
$(\mathcal{M}f)(\sigma - i\xi) = 2\pi \int_{-\infty}^{\infty} \left(f(e^{2\pi u}) e^{2\pi\sigma u}\right) e^{-2\pi i u \xi}\, du$, which
is the Fourier transform of $g(u) = 2\pi f(e^{2\pi u}) e^{2\pi\sigma u}$. The Mellin and Fourier transforms
are thus related; in fact, it is this logarithmic change of variables which explains why
both enter into Benford's law problems. For proofs of the Mellin transform properties
one can therefore just mimic the proofs of the corresponding statements for the Fourier
transform; good references are [SS1, SS2].
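
For the uniform density on $[0, 1]$ the definition (1.2) gives $(\mathcal{M}f)(s) = \int_0^1 x^{s-1}\,dx = 1/s$. The sketch below checks this numerically at the point $s = 1 - 2\pi i/\log 10$, which is where the transform is evaluated in the Benford arguments. The midpoint rule and the name `mellin_uniform` are our choices, made only for illustration.

```python
import math


def mellin_uniform(s, n=200_000):
    """Midpoint-rule approximation of the Mellin transform of the uniform density on [0, 1]."""
    h = 1.0 / n
    # The integrand is x**(s - 1); midpoints avoid evaluating at x = 0.
    return h * sum(((j + 0.5) * h) ** (s - 1) for j in range(n))


if __name__ == "__main__":
    s = complex(1.0, -2 * math.pi / math.log(10))
    approx = mellin_uniform(s)
    print(approx, 1 / s, abs(approx))
```

The printed modulus is well below 1; values strictly below 1 away from $s = 1$ are what drive the convergence to Benford behavior.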

Finally, we occasionally use big-Oh and little-oh notation. We write $f(x) = O(g(x))$
(or equivalently $f(x) \ll g(x)$) if there exist an $x_0$ and a $C > 0$ such that, for all
$x \ge x_0$, $|f(x)| \le C g(x)$, while $f(x) = o(g(x))$ means $\lim_{x\to\infty} f(x)/g(x) = 0$.



1.3. Results. Our main result is the following.

Theorem 1.1 (Limiting behavior of decompositions). Fix a continuous probability
density $f$ on $[0, 1]$ such that

$$\lim_{N\to\infty} \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{N} (\mathcal{M}h)\left(1 - \frac{2\pi i \ell}{\log 10}\right) = 0, \tag{1.3}$$

where $h(x)$ is either $f(x)$ or $f(1 - x)$ (the density of $1 - K$ if $K$ has density $f$).
Given a stick of length $L$, choose independent identically distributed random variables
$K_1, K_2, \dots, K_{2^N-1}$ with density $f$ and divide the stick as follows:

• Divide $L$ into $LK_1$ and $L(1 - K_1)$.

• Divide $LK_1$ into $LK_1K_2$ and $LK_1(1 - K_2)$, and $L(1 - K_1)$ into $L(1 - K_1)K_3$ and
$L(1 - K_1)(1 - K_3)$.

• Continue cutting each piece in two, obtaining after $N$ iterations

$$\begin{aligned}
X_1 &= LK_1K_2K_4 \cdots K_{2^{N-2}} K_{2^{N-1}} \\
X_2 &= LK_1K_2K_4 \cdots K_{2^{N-2}} (1 - K_{2^{N-1}}) \\
&\;\;\vdots \\
X_{2^N-1} &= L(1 - K_1)(1 - K_3)(1 - K_7) \cdots (1 - K_{2^{N-1}-1}) K_{2^N-1} \\
X_{2^N} &= L(1 - K_1)(1 - K_3)(1 - K_7) \cdots (1 - K_{2^{N-1}-1})(1 - K_{2^N-1}).
\end{aligned} \tag{1.4}$$

Define

$$P_N(s) := \frac{\sum_{i=1}^{2^N} \varphi_s(X_i)}{2^N}, \tag{1.5}$$

which is the fraction of partition pieces $X_1, \dots, X_{2^N}$ whose significand is less than or
equal to $s$ (see (1.1) for the definition of $\varphi_s$). Then

(1) $\lim_{N\to\infty} \mathbb{E}[P_N(s)] = \log_{10} s$.

(2) $\lim_{N\to\infty} \mathrm{Var}(P_N(s)) = 0$.

Thus as $N \to \infty$, the decomposition process is Benford.

Note that Part 1 of this theorem states that in the limit of having more and more 
copies of this process and the number of levels tending to infinity, the amalgamation 
converges to Benford's Law; Part 2 says we may consider just one process so long as 
the number of levels tends to infinity. The proof uses results from probability, analysis, 
and combinatorics, and is given in §2. A crucial input is a quantified convergence
of products of independent random variables to Benford behavior, with the error term
depending on the Mellin transform. We use Theorem 1.1 (and its generalization, given
in Remark 2.3) of [JKKKM]; for the convenience of the reader we quickly review this
result and its proof in Appendix A. In Part 2, unlike Part 1, the dependencies among the pieces are
a major obstruction; we surmount this by breaking the pairs into groups depending on
how dependent they are (specifically, how many cuts they share). 
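
The difference between the two parts can be seen in a small experiment. The sketch below is an illustration under the assumption of uniform cuts, with hypothetical helper names (`pieces_after`, `fraction_at_most`, `sample_mean_and_variance`): it repeats the single-stick process many times and reports the sample mean and sample variance of $P_N(s)$ for a shallow and a deeper decomposition.

```python
import math
import random


def pieces_after(levels, rng, length=1.0):
    """Lengths of the 2**levels pieces after repeatedly cutting every piece in two."""
    pieces = [length]
    for _ in range(levels):
        cuts = [rng.random() for _ in pieces]  # uniform cuts: an assumption
        pieces = [p * f for p, k in zip(pieces, cuts) for f in (k, 1.0 - k)]
    return pieces


def fraction_at_most(pieces, s):
    """P_N(s): fraction of pieces whose base-10 significand is at most s."""
    threshold = math.log10(s)
    hits = sum(1 for x in pieces if x > 0 and math.log10(x) % 1.0 <= threshold)
    return hits / len(pieces)


def sample_mean_and_variance(levels, s, runs, seed):
    """Sample mean and (unbiased) sample variance of P_N(s) over independent sticks."""
    rng = random.Random(seed)
    values = [fraction_at_most(pieces_after(levels, rng), s) for _ in range(runs)]
    mean = sum(values) / runs
    variance = sum((v - mean) ** 2 for v in values) / (runs - 1)
    return mean, variance


if __name__ == "__main__":
    for levels in (4, 10):
        print(levels, sample_mean_and_variance(levels, 2.0, 200, seed=1), math.log10(2.0))
```

A shrinking sample variance is what Part 2 predicts: for large $N$ a single decomposition is already close to Benford, with no averaging over many sticks.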

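Remark 1.2. We briefly remark on the key condition in the theorem, namely that the
density satisfies (1.3). This is an extremely weak condition, and is met by most
distributions. For example, if $f$ is the uniform density $\chi_{[0,1]}$ on $[0, 1]$, then

$$(\mathcal{M}\chi_{[0,1]})\left(1 - \frac{2\pi i \ell}{\log 10}\right) = \left(1 - \frac{2\pi i \ell}{\log 10}\right)^{-1} \tag{1.6}$$

(this is also true replacing $\chi_{[0,1]}(x)$ with $\chi_{[0,1]}(1 - x)$, as these densities are the same
on $[0, 1]$), which implies

$$\lim_{N\to\infty} \left| \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{N} (\mathcal{M}\chi_{[0,1]})\left(1 - \frac{2\pi i \ell}{\log 10}\right) \right| \le 2 \lim_{N\to\infty} \sum_{\ell=1}^{\infty} \left| 1 - \frac{2\pi i \ell}{\log 10} \right|^{-N} = 0. \tag{1.7}$$

We wrote the condition as $\prod_{m=1}^{N} (\mathcal{M}f)$ instead of $(\mathcal{M}f)^N$ to highlight where the changes
would surface if we allowed different densities for different cuts. This can be done by
using methods from Jang, Kang, Kruckman, Kudo and Miller [JKKKM], which we
discuss in Appendix A.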
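
To get a feel for how quickly the uniform case satisfies the condition, one can evaluate a truncated version of the sum $2\sum_{\ell \ge 1} |1 - 2\pi i \ell/\log 10|^{-N}$ for a few values of $N$. This is a rough numerical sketch: the truncation at `terms` and the name `uniform_condition_sum` are assumptions of ours, and the series only converges for $N \ge 2$.

```python
import math


def uniform_condition_sum(N, terms=100_000):
    """Truncated value of 2 * sum_{l >= 1} |1 - 2*pi*i*l / log(10)|**(-N), for N >= 2."""
    a = 2 * math.pi / math.log(10)
    return 2 * sum(abs(complex(1.0, -a * l)) ** -N for l in range(1, terms + 1))


if __name__ == "__main__":
    for N in (2, 4, 8, 16):
        print(N, uniform_condition_sum(N))
```

The $\ell = \pm 1$ terms dominate, and since $|1 - 2\pi i/\log 10|$ is slightly larger than 2.9, the sum decays roughly like $2 \cdot 2.9^{-N}$.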



Remark 1.3. In Theorem 1.1 we assumed for simplicity that at each stage each piece
must split into exactly two pieces. Modifying the proof, one can show Benford behavior
is also attained in the limit if at each stage each piece independently splits into
$0, 1, 2, \dots$ or $k$ pieces with probabilities $p_0, p_1, p_2, \dots, p_k$ (the case above is $p_2 = 1$).



2. Proof of Main Result 

We now turn to the proof of Theorem 1.1. We first prove that the first digit distribution
of the given decomposition model is Benford on average and in the limit (Part 1 of the 
theorem). We then prove that this average first-digit distribution will, in fact, almost 
surely be the outcome of the model in the limit (Part 2 of the theorem). 
2.1. Proof of Part 1.

Proof of Theorem 1.1(1). By the linearity of the expectation operator, we have

$$\mathbb{E}[P_N(s)] = \mathbb{E}\left[\frac{\sum_{i=1}^{2^N} \varphi_s(X_i)}{2^N}\right] = \frac{1}{2^N} \sum_{i=1}^{2^N} \mathbb{E}[\varphi_s(X_i)]. \tag{2.1}$$

We recall that all pieces $X_i$ can be expressed as the product of the starting length
$L$ and $N$ independent random variables taking on values in $(0, 1)$. While there are
dependencies among the $X_i$'s, there are no dependencies among the $K_i$'s. For a given
$i$, there will be some number, say $M_i$, of cuts with factors drawn from $K$, and $N - M_i$
cuts with factors drawn from $1 - K$ (where $K$ is a random variable on $[0, 1]$ with density
$f$). By relabeling if necessary, we may assume

$$X_i = LK_1K_2 \cdots K_{M_i}(1 - K_{M_i+1}) \cdots (1 - K_N); \tag{2.2}$$

the first $M_i$ random variables have density $f(x)$ and the last $N - M_i$ have $f(1 - x)$.
The proof is completed by showing $\lim_{N\to\infty} \mathbb{E}[\varphi_s(X_i)] = \log_{10} s$. We have

$$\mathbb{E}[\varphi_s(X_i)] = \int_0^1 \int_0^1 \cdots \int_0^1 \varphi_s\left(L \prod_{r=1}^{M_i} k_r \prod_{p=M_i+1}^{N} (1 - k_p)\right) \cdot \prod_{r=1}^{M_i} f(k_r)\,dk_r \prod_{p=M_i+1}^{N} f(1 - k_p)\,dk_p. \tag{2.3}$$

This is equivalent to studying the distribution of a product of $N$ independent random
variables and then rescaling the result by $L$. By the Pigeonhole Principle, we have
at least $N/2$ random variables with factors $k_r$ and density $f(k_r)$ or at least $N/2$ with
factors $1 - k_p$ and density $f(1 - k_p)$. The convergence to Benford now follows from
results of Jang, Kang, Kruckman, Kudo and Miller [JKKKM] (which are summarized
for the reader's convenience in Appendix A). The key observation is to note that the
Mellin transform at $1 - \frac{2\pi i \ell}{\log 10}$ is strictly less than 1 in absolute value for continuous
densities if $\ell \ne 0$ (which is seen by trivially inserting absolute values in the definition
of the Mellin transform).

Explicitly, from Appendix A we find $\mathbb{E}[\varphi_s(X_i)]$ equals $\log_{10} s$ plus a rapidly decaying
$N$-dependent error term for all probability densities that satisfy the condition in the
statement of the Theorem, which includes the uniform distribution. We may take the
error to be independent of $M_i$ (in other words, we can obtain a bound that holds for all
decompositions simultaneously). We can do this by noting the Mellin transforms we
have (with $\ell \ne 0$) are always less than 1 in absolute value. Thus the error is bounded by
the maximum of the error from a product with $N/2$ terms with density $f(x)$ or a product
with $N/2$ terms with density $f(1 - x)$. Thus $\lim_{N\to\infty} \mathbb{E}[P_N(s)] = \log_{10} s$, completing the
proof. □

Remark 2.1. For specific choices of $f$ we can obtain precise bounds on the error. For
example, if each cut is chosen uniformly on $(0, 1)$, then the density of $K_i$ and $1 - K_i$ is
the same. By Corollary A.2,

$$|\mathbb{E}[\varphi_s(X_i)] - \log_{10} s| \ll \frac{1}{2.9^N} \tag{2.4}$$

and thus

$$|\mathbb{E}[P_N(s)] - \log_{10} s| \ll \frac{1}{2^N} \sum_{i=1}^{2^N} \frac{1}{2.9^N} = \frac{1}{2.9^N} \tag{2.5}$$

(we may use 10 for the big-Oh constant above).



2.2. Proof of Part 2. 



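Proof of Theorem 1.1(2). For ease of exposition, we assume all the cuts are drawn from
the uniform distribution on $(0, 1)$. To facilitate the minor changes needed for the
general case, we argue as generally as possible for as long as possible.

We begin by noting that since $\varphi_s(X_i)$ is either 0 or 1, $\varphi_s(X_i)^2 = \varphi_s(X_i)$. From this
observation, and the definition of variance and the linearity of the expectation operator,
we have

$$\begin{aligned}
\mathrm{Var}(P_N(s)) &= \mathbb{E}[P_N(s)^2] - \mathbb{E}[P_N(s)]^2 \\
&= \mathbb{E}\left[\frac{\left(\sum_{i=1}^{2^N} \varphi_s(X_i)\right)^2}{2^{2N}}\right] - \mathbb{E}[P_N(s)]^2 \\
&= \mathbb{E}\left[\frac{\sum_{i=1}^{2^N} \varphi_s(X_i)^2}{2^{2N}} + \frac{\sum_{i \ne j} \varphi_s(X_i)\varphi_s(X_j)}{2^{2N}}\right] - \mathbb{E}[P_N(s)]^2 \\
&= \frac{1}{2^N}\mathbb{E}[P_N(s)] + \frac{1}{2^{2N}} \sum_{\substack{i,j=1 \\ i \ne j}}^{2^N} \mathbb{E}[\varphi_s(X_i)\varphi_s(X_j)] - \mathbb{E}[P_N(s)]^2.
\end{aligned} \tag{2.6}$$

From Theorem 1.1(1), $\mathbb{E}[P_N(s)] = \log_{10} s + o(1)$; here $o(1)$ means an error that tends
to zero as $N \to \infty$. Thus

$$\mathrm{Var}(P_N(s)) = \frac{1}{2^{2N}} \sum_{\substack{i,j=1 \\ i \ne j}}^{2^N} \mathbb{E}[\varphi_s(X_i)\varphi_s(X_j)] - \log_{10}^2 s + o(1). \tag{2.7}$$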



The problem is now reduced to evaluating the cross terms over all $i \ne j$. This is
the hardest part of the analysis, and it is not feasible to evaluate the resulting integrals
directly. Instead, for each $i$ we partition the pairs $(X_i, X_j)$ based on how 'close' $X_j$
is to $X_i$ in our tree (see Figure 1). We do this as follows. Recall that each of the $2^N$
pieces is a product of the starting length $L$ and $N$ random variables between 0 and 1.
Writing any $(X_i, X_j)$ pair in this form, it is clear that they must share some number
of these random variables, say $M$ terms. After $M$ stages, the pieces $X_i$ and $X_j$ split,
such that one piece has a factor $K_{M+1}$ in its product, while the other contains the factor
$(1 - K_{M+1})$. The remaining $N - M - 1$ elements in each product are independent
from one another. After re-labeling, we can thus express any $X_i, X_j$ pair, without loss
of generality,^2 as:

$$\begin{aligned}
X_i &= L \cdot K_1 \cdot K_2 \cdots K_M \cdot K_{M+1} \cdot K_{M+2} \cdots K_N \\
X_j &= L \cdot K_1 \cdot K_2 \cdots K_M \cdot (1 - K_{M+1}) \cdot \tilde{K}_{M+2} \cdots \tilde{K}_N,
\end{aligned} \tag{2.8}$$

where the tildes mark the random variables appearing only in $X_j$.

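With these definitions in mind, and again denoting the probability density function
from which these random variables are drawn as $f(k)$ and $f(1 - k)$, for a fixed $i, j$ pair
we have

$$\begin{aligned}
\mathbb{E}[\varphi_s(X_i)\varphi_s(X_j)] = {} & \int_{k_1=0}^{1} \int_{k_2=0}^{1} \cdots \int_{k_N=0}^{1} \int_{\tilde{k}_{M+2}=0}^{1} \cdots \int_{\tilde{k}_N=0}^{1} \varphi_s\left(L \prod_{r=1}^{M+1} k_r \prod_{r=M+2}^{N} k_r\right) \\
& \cdot \varphi_s\left(L \prod_{r=1}^{M} k_r \cdot (1 - k_{M+1}) \cdot \prod_{r=M+2}^{N} \tilde{k}_r\right) \\
& \cdot \prod_{r=1}^{N} f(k_r) \prod_{r=M+2}^{N} f(1 - \tilde{k}_r)\, dk_1\, dk_2 \cdots dk_N\, d\tilde{k}_{M+2} \cdots d\tilde{k}_N.
\end{aligned} \tag{2.9}$$

In the statement above we have integrated over the remaining $2^N - N - (N - M - 1) =
2^N - 2N + M + 1$ variables; since those variables do not appear in any of the cuts in
$X_i$ or $X_j$, their corresponding integrals are 1.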

The difficulty in understanding (2.9) is that many variables occur in both $\varphi_s(X_i)$
and $\varphi_s(X_j)$. The key observation is that most of the time there are many variables
occurring in one but not the other, which minimizes the effects of the common variables
and essentially leads to evaluating $\varphi_s$ at almost independent arguments. We make this
precise below, keeping track of the errors.

We can study the behavior of the integral in (2.9) as a function of the significand of
the first $M + 1$ random variables. More specifically, we define the functions

$$\begin{aligned}
I_1(L_1) &:= \int_{k_{M+1}=0}^{1} \cdots \int_{k_N=0}^{1} \varphi_s\left(L_1 \prod_{r=M+1}^{N} k_r\right) \prod_{r=M+1}^{N} f(k_r)\, dk_{M+1}\, dk_{M+2} \cdots dk_N \\
I_2(L_2) &:= \int_{k_{M+1}=0}^{1} \cdots \int_{k_N=0}^{1} \varphi_s\left(L_2 \prod_{r=M+1}^{N} k_r\right) \prod_{r=M+1}^{N} f(1 - k_r)\, dk_{M+1}\, dk_{M+2} \cdots dk_N,
\end{aligned} \tag{2.10}$$

which are defined for all $L_1, L_2 \in [1, 10)$. We will show that, for any $L_1, L_2$, we
have $|I_1(L_1) I_2(L_2) - (\log_{10} s)^2| = o(1)$. Once we have this, then all that remains is
to integrate $I_1(L_1) I_2(L_2)$ over the remaining $M$ variables $(k_1, \dots, k_M)$, which will be
$(\log_{10} s)^2 + o(1)$. The rest of the proof follows from counting, for a given $i$, how many
$j$ lead to a given $M$.

It is at this point where we require the assumption about $f(x)$ from the statement
of the theorem, namely that $f(x)$ and $f(1 - x)$ satisfy (1.3). For illustrative purposes,
we assume that each cut $K$ is drawn from a uniform distribution, meaning $f(x)$ and



^2 Looking at Figure 1 we see that the labeling given below cannot be right (for example, $K_2$ and
$K_3$ are in different branches); however, it is convenient to relabel the indices of the random variables.
$f(1 - x)$ are the probability density function associated with the uniform distribution
on [0, 1]. The argument can readily be generalized to other distributions; we choose to 
highlight the uniform case as it is simpler, important, and we can obtain a very explicit, 
good bound on the error. 

Both $I_1(L_1)$ and $I_2(L_2)$ involve integrals over $N - M + 1$ variables; we set $n :=
N - M - 1$. For the case of a uniform distribution, equation (3.7) of [JKKKM] (or see
Corollary A.2) gives for $n \ge 4$ that^3

$$|I_1(L_1) - \log_{10} s| \le \left(\frac{1}{2.9^n} + \frac{\zeta(n) - 1}{2.7^n}\right) 2\log_{10} s. \tag{2.11}$$

Note that for all choices of $L_1$, $I_1(L_1) \in [0, 1]$, and for $n < 4$ we may simply bound the
difference by 1. It is also important to note that for $n > 1$, $\zeta(n) - 1$ is $O(1/2^n)$, and
thus the error term decays very rapidly.

Replacing $L_1$ with $L_2$ yields a similar bound for $I_2(L_2)$. As such, we can choose a
constant $C$ such that

$$|I_1(L_1) - \log_{10} s| \le \frac{C}{2.9^n}, \qquad |I_2(L_2) - \log_{10} s| \le \frac{C}{2.9^n} \tag{2.12}$$

for all $n, L_1, L_2$. Because of this rapid decay, by the triangle inequality it follows that

$$|I_1(L_1) \cdot I_2(L_2) - (\log_{10} s)^2| \le \frac{2C}{2.9^n}. \tag{2.13}$$

For each of the $2^N$ choices of $i$, and for each $1 \le n \le N$, there are $2^{n-1}$ choices of $j$
such that $X_j$ has exactly $n$ factors not in common with $X_i$. We can therefore obtain an
upper bound for the sum of the expectation cross terms by summing the bound obtained
for $2^{n-1} I_1(L_1) \cdot I_2(L_2)$ over all $n$ and all $i$:



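$$\sum_{\substack{i,j=1 \\ i \ne j}}^{2^N} \left(\mathbb{E}[\varphi_s(X_i)\varphi_s(X_j)] - \log_{10}^2 s\right) \le \sum_{i=1}^{2^N} \sum_{n=1}^{N} 2^{n-1} \frac{2C}{2.9^n} < 2^N \cdot 4C. \tag{2.14}$$

Substituting this into equation (2.7) yields

$$\mathrm{Var}(P_N(s)) \le \frac{4C}{2^N} + o(1). \tag{2.15}$$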



Since the variance must be non-negative by definition, it follows that $\lim_{N\to\infty} \mathrm{Var}(P_N(s)) = 0$,
completing the proof if each cut is drawn from a uniform distribution. The more
general case follows analogously, appealing to Theorem A.1 with non-identically
distributed random variables (though there are only two densities). □
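
The counting step in the proof can be verified by brute force. In the sketch below (our own labeling, not the paper's) each of the $2^N$ pieces is identified with an $N$-bit string recording, level by level, which side of the cut it came from; two pieces share cuts exactly as long as their strings agree. The names `unshared_counts` and `cross_term_weight` are hypothetical.

```python
def unshared_counts(N, i):
    """For piece i, map n -> number of pieces j != i with exactly n factors not shared."""
    counts = {}
    for j in range(2 ** N):
        if j == i:
            continue
        # With the first cut stored in the top bit, the labels agree on a prefix
        # and differ from the highest set bit of i ^ j downward, so the number of
        # unshared levels equals the bit length of i ^ j.
        n = (i ^ j).bit_length()
        counts[n] = counts.get(n, 0) + 1
    return counts


def cross_term_weight(N):
    """sum_{n=1}^{N} 2**(n-1) * 2 / 2.9**n, the factor multiplying C in (2.14)."""
    return sum(2 ** (n - 1) * 2 / 2.9 ** n for n in range(1, N + 1))


if __name__ == "__main__":
    print(unshared_counts(4, 5))
    print(cross_term_weight(50))
```

The geometric ratio $2/2.9 < 1$ is what keeps the weighted sum bounded independently of $N$, so the $2^N$ choices of $i$ contribute at most $2^N \cdot 4C$ in total.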



^3 Our situation is slightly different as we multiply the product by $L_1$; however, all this does is translate
the distribution of the logarithms by a fixed amount, and hence the error bounds are preserved. 
Appendix A. Convergence of Products to Benford

As this paper crucially builds on the fact that many products of independent random
variables are Benford, we quickly sketch a proof of this result. The arguments below
are a condensed version of [JKKKM]. The main change is that in [JKKKM] the
results were stated for chains of random variables, whereas here we give the equivalent
formulation for products. 



A.1. Preliminaries. Before giving the proof, we review some notation and needed
properties of the Mellin transform. The density of the random variable $\Xi_1 \cdot \Xi_2$ (the
product of two independent random variables with density $f$ and cumulative distribution
function $F$) is

$$\int_0^\infty f\left(\frac{x}{t}\right) f(t)\, \frac{dt}{t} \tag{A.1}$$

(the generalization to more products is straightforward). To see this, we first calculate
the probability that $\Xi_1 \cdot \Xi_2 \in [0, x]$ and then differentiate with respect to $x$. Thus

$$\mathrm{Prob}(\Xi_1 \cdot \Xi_2 \in [0, x]) = \int_{t=0}^{\infty} \mathrm{Prob}\left(\Xi_2 \in \left[0, \frac{x}{t}\right]\right) f(t)\,dt = \int_0^\infty F\left(\frac{x}{t}\right) f(t)\,dt. \tag{A.2}$$

Differentiating gives the density of $\Xi_1 \cdot \Xi_2$, which equals

$$\int_0^\infty f\left(\frac{x}{t}\right) f(t)\, \frac{dt}{t}; \tag{A.3}$$

the factor of $1/t$ actually facilitates the upcoming analysis.

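If $g(s)$ is an analytic function for $\Re(s) \in (a, b)$ such that $g(c + iy)$ tends to zero
uniformly as $|y| \to \infty$ for any $c \in (a, b)$, then the inverse Mellin transform, $(\mathcal{M}^{-1}g)(x)$,
is given by

$$(\mathcal{M}^{-1}g)(x) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} g(s)\, x^{-s}\, ds \tag{A.4}$$

(provided that the integral converges absolutely). If we set $g(s) = (\mathcal{M}f)(s)$ then
$f(x) = (\mathcal{M}^{-1}g)(x)$. We define the convolution of two functions $f_1$ and $f_2$ by

$$(f_1 \star f_2)(x) = \int_0^\infty f_2\left(\frac{x}{t}\right) f_1(t)\, \frac{dt}{t}. \tag{A.5}$$

The Mellin convolution theorem states that

$$(\mathcal{M}(f_1 \star f_2))(s) = (\mathcal{M}f_1)(s) \cdot (\mathcal{M}f_2)(s), \tag{A.6}$$

which by induction gives

$$(\mathcal{M}(f_1 \star \cdots \star f_N))(s) = (\mathcal{M}f_1)(s) \cdots (\mathcal{M}f_N)(s); \tag{A.7}$$

note $f_1 \star \cdots \star f_N$ is the density of the product of $N$ random variables.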
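
A quick numerical check of the convolution theorem is possible for two uniform variables on $[0, 1]$: by (A.3) their product has density $\int_x^1 dt/t = -\log x$ on $(0, 1)$, so its Mellin transform should equal $(1/s)^2$. The sketch below uses a midpoint rule with real $s$; the function names are ours and the quadrature is only an illustration.

```python
import math


def mellin_on_unit_interval(f, s, n=200_000):
    """Midpoint-rule Mellin transform of a density supported on (0, 1), for real s >= 1."""
    h = 1.0 / n
    return h * sum(f((j + 0.5) * h) * ((j + 0.5) * h) ** (s - 1) for j in range(n))


def uniform_density(x):
    return 1.0


def product_of_two_uniforms_density(x):
    # (f * f)(x) = integral from x to 1 of dt / t = -log(x) for 0 < x < 1
    return -math.log(x)


if __name__ == "__main__":
    for s in (1.0, 2.0, 3.5):
        lhs = mellin_on_unit_interval(product_of_two_uniforms_density, s)
        rhs = mellin_on_unit_interval(uniform_density, s) ** 2
        print(s, lhs, rhs, 1 / s ** 2)
```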
A.2. Proof of Product Result. We now sketch the proof that products of independent,
identically distributed random variables converge to Benford, and isolate the error
term. The description below is modified from [JKKKM] with permission.



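Theorem A.1 (Jang, Kang, Kruckman, Kudo and Miller [JKKKM]). Let $\Xi_1, \dots, \Xi_N$
be independent random variables with densities $f_{\Xi_m}$. Assume

$$\lim_{N\to\infty} \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{N} (\mathcal{M}f_{\Xi_m})\left(1 - \frac{2\pi i \ell}{\log B}\right) = 0. \tag{A.8}$$

Then as $N \to \infty$, $X_N = \Xi_1 \cdots \Xi_N$ converges to Benford's law. In particular, if $Y_N =
\log_B X_N$ then

$$|\mathrm{Prob}(Y_N \bmod 1 \in [a, b]) - (b - a)| \le (b - a) \cdot \left| \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{N} (\mathcal{M}f_{\Xi_m})\left(1 - \frac{2\pi i \ell}{\log B}\right) \right|. \tag{A.9}$$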



Proof. To investigate the distribution of the digits of $X_N = \Xi_1 \cdots \Xi_N$ (base $B$) it's
convenient to make a logarithmic change of variables, setting $Y_N = \log_B X_N$. We have

$$\mathrm{Prob}(Y_N \le y) = \mathrm{Prob}(X_N \le B^y) = F_N(B^y), \tag{A.10}$$

where $f_N$ is the density of $X_N$ and $F_N$ is the cumulative distribution function. Taking
the derivative gives the density of $Y_N$, which we denote by $g_N(y)$:

$$g_N(y) = f_N(B^y) B^y \log B. \tag{A.11}$$

A standard method to show $X_N$ tends to Benford behavior is to show that $Y_N \bmod 1$
tends to the uniform distribution on $[0, 1]$ (see for example [Dia, MT]). This can be
seen from the following calculation. The key ingredient is Poisson Summation. Let
$h_{N,y}(t) = g_N(y + t)$. Then

$$\sum_{\ell=-\infty}^{\infty} g_N(y + \ell) = \sum_{\ell=-\infty}^{\infty} h_{N,y}(\ell) = \sum_{\ell=-\infty}^{\infty} \hat{h}_{N,y}(\ell) = \sum_{\ell=-\infty}^{\infty} e^{2\pi i y \ell}\, \hat{g}_N(\ell), \tag{A.12}$$

where $\hat{f}$ denotes the Fourier transform of $f$:

$$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi}\, dx. \tag{A.13}$$
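Letting $[a, b] \subset [0, 1]$, we see that

$$\begin{aligned}
\mathrm{Prob}(Y_N \bmod 1 \in [a, b]) &= \sum_{\ell=-\infty}^{\infty} \int_{a+\ell}^{b+\ell} g_N(y)\,dy \\
&= \int_a^b \sum_{\ell=-\infty}^{\infty} g_N(y + \ell)\,dy \\
&= \int_a^b \sum_{\ell=-\infty}^{\infty} e^{2\pi i y \ell}\, \hat{g}_N(\ell)\,dy \\
&= b - a + \mathrm{Err}\left((b - a) \sum_{\ell \ne 0} |\hat{g}_N(\ell)|\right),
\end{aligned} \tag{A.14}$$

where $\mathrm{Err}[z]$ means an error at most $z$ in absolute value. Note that since $g_N$ is a
probability density, $\hat{g}_N(0) = 1$. The proof is completed by showing that the sum over $\ell$
tends to zero as $N \to \infty$. We thus need to compute $\hat{g}_N(\ell)$:

$$\begin{aligned}
\hat{g}_N(\xi) &= \int_{-\infty}^{\infty} g_N(y) e^{-2\pi i y \xi}\,dy \\
&= \int_{-\infty}^{\infty} f_N(B^y) B^y \log B \cdot e^{-2\pi i y \xi}\,dy \\
&= (\mathcal{M}f_N)\left(1 - \frac{2\pi i \xi}{\log B}\right) \\
&= \prod_{m=1}^{N} (\mathcal{M}f_{\Xi_m})\left(1 - \frac{2\pi i \xi}{\log B}\right).
\end{aligned} \tag{A.15}$$

Substituting completes the proof. □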

We isolate the result in the special case that all cuts are drawn from the uniform
distribution on $(0, 1)$. The error term below depends on the value of the Riemann zeta
function

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \quad (\Re(s) > 1), \tag{A.16}$$

at positive integers. As $\zeta(N) - 1 = O(1/2^N)$, the error term below is essentially $1/2.9^N$
for $N$ large.

Corollary A.2 (Products of Independent Uniform Random Variables). Let $\Xi_1, \dots, \Xi_N$
be $N$ independent random variables that are uniformly distributed on $(0, 1)$, and let
$\mathrm{Sig}_N(s)$ be the probability that the significand of $\Xi_1 \cdots \Xi_N$ (base 10) is at most $s$. For
$N \ge 4$ we have

$$|\mathrm{Sig}_N(s) - \log_{10} s| \le \left(\frac{1}{2.9^N} + \frac{\zeta(N) - 1}{2.7^N}\right) 2\log_{10} s. \tag{A.17}$$
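
The corollary can be compared against a Monte Carlo estimate. The sketch below is only a sanity check under our own choices: the helper names are hypothetical, $\zeta(N) - 1$ is approximated by a truncated sum, and the sampling error (of order one over the square root of the number of trials) is typically much larger than the bound itself once $N$ is moderately large.

```python
import math
import random


def uniform_product_bound(N, s, terms=10_000):
    """Right-hand side of (A.17), with zeta(N) - 1 approximated by a truncated sum."""
    zeta_minus_one = sum(n ** -N for n in range(2, terms + 1))
    return (2.9 ** -N + zeta_minus_one * 2.7 ** -N) * 2 * math.log10(s)


def empirical_sig(N, s, trials, rng):
    """Monte Carlo estimate of the probability that the significand of a product
    of N independent uniform variables on (0, 1) is at most s."""
    threshold = math.log10(s)
    hits = 0
    for _ in range(trials):
        # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
        log_product = sum(math.log10(1.0 - rng.random()) for _ in range(N))
        if log_product % 1.0 <= threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(42)
    for N in (4, 6, 8):
        estimate = empirical_sig(N, 2.0, 100_000, rng)
        print(N, estimate, math.log10(2.0), uniform_product_bound(N, 2.0))
```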